Identifying Lexical Relationships and Entailments with Distributional Semantics
Author
Abstract
As the field of Natural Language Processing has developed, research has progressed on ambitious semantic tasks like Recognizing Textual Entailment (RTE). Systems that approach these tasks may perform sophisticated inference between sentences, but often depend heavily on lexical resources like WordNet to provide critical information about relationships and entailments between lexical items. However, lexical resources are expensive to create and maintain, and are never fully comprehensive. Distributional Semantics has long provided a method to automatically induce meaning representations for lexical items from large corpora with little or no annotation effort. The resulting representations are excellent proxies of semantic similarity: words will have similar representations if their meanings are similar. Yet knowing that two words are similar does not tell us their relationship or whether one entails the other. We present several models for identifying specific relationships and entailments from distributional representations of lexical semantics. Broadly, this work falls into two distinct but related areas: the first predicts specific ontology relations and entailment decisions between lexical items devoid of context; the second predicts specific lexical paraphrases in complete sentences. We provide insight and analysis of how and why our models are able to generalize to novel lexical items and improve upon prior work. We propose several short- and long-term extensions to our work. In the short term, we propose applying one of our hypernymy-detection models to other relationships and evaluating our more recent work in an end-to-end RTE system. In the long term, we propose adding consistency constraints to our lexical relationship prediction, better integration of context into our lexical paraphrase model, and new distributional models for improving word representations.
Similar resources
Modeling the Non-Substitutability of Multiword Expressions with Distributional Semantics and a Log-Linear Model
Non-substitutability is a property of Multiword Expressions (MWEs) that often causes lexical rigidity and is relevant for most types of MWEs. Efficient identification of this property can result in the efficient identification of MWEs. In this work we propose using distributional semantics, in the form of word embeddings, to identify candidate substitutions for a candidate MWE and model its sub...
Using a Distributional Neighbourhood Graph to Enrich Semantic Frames in the Field of the Environment
This paper presents a semi-automatic method for identifying terms that evoke semantic frames (Fillmore, 1982). The method is tested as a means of identifying lexical units that can be added to existing frames or to new, related frames, using a large corpus on the environment. It is hypothesized that a method based on distributional semantics, which exploits the assumption that words that appear...
Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
We consider the task of predicting lexical entailment using distributional vectors. We perform a novel qualitative analysis of one existing model which was previously shown to only measure the prototypicality of word pairs. We find that the model strongly learns to identify hypernyms using Hearst patterns, which are well known to be predictive of lexical relations. We present a novel model whic...
Combining Distributional Semantics and Structured Data to Study Lexical Change
Statistical Natural Language Processing (NLP) techniques make it possible to quantify lexical semantic change using large text corpora. Word-level results of these methods can be hard to analyse in the context of sets of semantically or linguistically related words. On the other hand, structured knowledge sources represent such relationships explicitly, but ignore the problem of semantic change. ...
Can distributional approaches improve on Good Old-Fashioned Lexical Semantics?
In this position paper, I discuss some linguistic problems that computational work on lexical semantics has attempted to address in the past and the implications for alternative models which incorporate distributional information. I concentrate in particular on phenomena involving count/mass distinctions, where older approaches attempted to use lexical semantics in their models of syntax. I out...
Publication date: 2016